Estimate General Practitioners Active Supply in Iran: Capture-Recapture Method for Three Data Sources

Background: Health authorities need an accurate estimate of the number of active general practitioners (GPs) in order to estimate workforce requirements. This study aimed to estimate the active supply of GPs in Iran using the three-source capture-recapture (CRC) method. Methods: This cross-sectional study collected data during 2015–2016, targeting all GPs registered in three independent data sources: a national survey of all hospitals, the database of the human resource management office at the health ministry, and the physicians’ offices databank. Medical council codes, GP names, surnames and national ID codes were used to link records among the three sources. The three-source CRC method was applied using log-linear models in STATA software to estimate the total number of active GPs. Results: Overall, 27,048 GPs were identified after removing duplicate records. Based on the three-source CRC analysis, the total number of GPs was 53,630 in 2015–2016. The provincial distribution of GPs per 1,000 population showed that Kohgiluyeh & Boyer Ahmad, Mazandaran, Golestan and Yazd, with ratios of 1.28, 1.28, 1.21 and 1.17, ranked highest, while Sistan & Baluchestan, Ilam, Zanjan, Alborz and North Khorasan, with ratios of 0.24, 0.40, 0.40, 0.43 and 0.45, ranked lowest. Conclusion: The CRC method is considered the best and fastest method for estimating the number of active GPs because it is compatible with the current state of databanks in Iran. It is therefore useful for human resource distribution and planning.


Introduction
General practitioners (GPs) are the first providers of medical services in a community and, as referral agents, play a significant role in community satisfaction (1). The gatekeeping role of the GP is therefore one of the most important criteria for a comprehensive and strong healthcare system (2). In Iran, however, there is no accurate estimate of the number of active GPs because there is no centralized and up-to-date registration system. Lack of knowledge about the number of active GPs, together with contradictory statistics in the early years of the family physician program in Iran, raised concerns about the program's success (3,4). These concerns included a possible rise in GP unemployment, based on the unrealistic figure of 80 thousand GPs in the national Medical Council bank. Moreover, implementing the family physician plan seemed impossible because of "misleading statistics" from other databanks, which underestimated the actual number of active physicians required to fulfill the plan (4,5). The main reason for this "Bubble Hit" might be incomplete detection and registration of active physicians or a lack of information about the dynamics of GPs. According to a few existing studies (5,6), a large number of GPs registered in the medical council databank have moved into other activities such as administrative or managerial duties at the ministry headquarters, or have left medical practice because of migration, specialization studies, retirement, or work in fields unrelated to medicine such as medical equipment, pharmaceuticals, and beauty and fitness care (6,7). However, without a single centralized and up-to-date registration system, such reports cannot be relied upon, and the lack of accurate information has become a serious issue for planners and policy makers. In developed countries, where valid databanks exist, active physicians are estimated through surveys and data-matching models.
In two studies (8,9), databanks were matched to estimate the active number of health workers, while other studies combined databank matching with surveys (10). However, databanks of active doctors in Iran are decentralized and maintained independently in different locations. Moreover, each covers only a fraction of active GPs. Recently, the capture-recapture (CRC) method has been increasingly favored in epidemiologic research (11). The CRC method was originally used to estimate animal populations (12); it was later extended to epidemiology, to estimate disease prevalence and to handle other situations in which the available data sources are incomplete. The method rests on some assumptions: for example, the sources are assumed to be independent, and all individuals are assumed to have an equal chance of being included in a given source. Because it is compatible with the current state of databanks in Iran, this study used the CRC method as a novel way to estimate the number of active GPs and their distribution in the country.

Three Sources Data Collection
This research was a cross-sectional study conducted in 2015-2016 targeting all GPs in Iran. It used data from three independent sources: a survey of all hospitals in Iran, the Ministry of Health and Medical Education (MOHME) database, and the physicians' offices databank of the GPs' licensing office of MOHME. These sources were matched to identify the number of active GPs. The national hospitals survey gathered data from 925 hospitals in 2015-2016 using the personnel records of each hospital. The Human Resources Management (HRM) office in MOHME had a nationally registered database consisting of a list of GPs employed by MOHME. This source is updated on a regular basis when GPs' service delivery locations change. The physicians' offices databank was the third nationally registered data source; it renews the license numbers of GPs who participate in annual training for the renewal of practice permits and the enhancement of professional skills in consecutive years. Data on the national population were obtained from Iran's statistics center. The national survey of hospitals had three main steps: data collection, assessment of data quality, and gathering of missed data. In the first step, an Excel form with instructions was sent to the medical universities across the provinces. The data requested on GPs comprised demographic characteristics, national identification code, medical council code, hospital name, province and town. Blank fields were filled at this step by matching against the national medical council databank using SQL queries in Microsoft Office Access. In the second step, the quality of the data was evaluated in terms of accuracy. To optimize data collection, the medical universities introduced focal points as liaisons with the research team to maintain contact on progress. When issues arose, the hospitals were reminded by an official letter and by direct phone calls to the focal points.

Linking data among different sources
The medical council code of each GP was used to link the three data sources using SQL queries in Microsoft Office Access, and duplicate records were removed. Records without a medical council code were then matched on other variables, such as national ID code, first name and last name, against the medical council reference bank, which holds data on all medical school graduates. In this way, the missing medical council codes were extracted and completed.
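The linkage step can be illustrated with a minimal sketch in Python rather than the Access SQL used in the study. It assumes every record already carries a medical council code. The function names (`link_sources`, `tabulate`) and the toy codes are hypothetical, and the fallback matching on national ID and name is not covered. Each unique code receives a capture history showing which of the three sources contained it.

```python
from collections import defaultdict

def link_sources(hospital, hrm, offices):
    """Map each medical council code to a capture history
    (in_hospital_survey, in_hrm_database, in_offices_databank)."""
    histories = defaultdict(lambda: [0, 0, 0])
    for idx, source in enumerate((hospital, hrm, offices)):
        for code in source:
            key = str(code).strip()
            if key:  # skip blank codes; the study resolved these separately
                histories[key][idx] = 1  # repeated codes collapse to one GP
    return {code: tuple(flags) for code, flags in histories.items()}

def tabulate(histories):
    """Count GPs per capture history; these counts feed the CRC model."""
    counts = defaultdict(int)
    for flags in histories.values():
        counts[flags] += 1
    return dict(counts)

# Hypothetical toy codes, not study data.
hospital = ["1001", "1002", "1003", "1003"]
hrm = ["1002", "1003", "1004", "1005"]
offices = ["1003", "1005", "1006"]

histories = link_sources(hospital, hrm, offices)
cells = tabulate(histories)
print(len(histories), cells)
```

The number of keys in `histories` corresponds to the count of GPs after duplicate removal, and `cells` is the incomplete 2x2x2 table whose missing cell (GPs in none of the sources) is what the CRC analysis estimates.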

Statistical analysis
The three-source CRC method was applied using log-linear models to estimate the total number of active GPs in Iran. The data linkage in the previous step identified records with matching variables and prepared a databank for the analysis stage. CRC analysis required the following assumptions: the data sources were independent and all GPs had an equal chance of inclusion (11). The CRC method had the advantage of providing a more accurate estimate of active GPs, the completeness of each data source, and the GP-to-population ratio in Iran. Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the G2 test (also called the log likelihood-ratio test) were used to assess goodness of fit in model selection. The log-linear model with the lowest AIC was selected as the best model. All statistical calculations were performed in STATA software version 12 (StataCorp, Texas, USA).
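As an illustration of the modelling step, the following is a minimal sketch in pure Python, not the STATA procedure used in the study. It fits Poisson log-linear models to the seven observable capture-history counts by iteratively reweighted least squares, takes the exponentiated intercept as the estimated number of GPs missed by all three sources, and compares models by AIC. The function names, the toy counts and the restriction to two candidate models are assumptions made for illustration; the sketch also assumes every observed cell is positive.

```python
import math
from itertools import combinations, product

# Observable capture histories (hospital survey, HRM database, offices databank);
# the (0, 0, 0) cell, GPs missed by every source, is what the model estimates.
CELLS = [h for h in product((0, 1), repeat=3) if any(h)]

def design_row(history, pairs):
    """Intercept, three main effects, then the requested two-way interactions."""
    return ([1.0] + [float(v) for v in history]
            + [float(history[i] * history[j]) for i, j in pairs])

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_loglinear(counts, pairs=(), iterations=50):
    """Fit a Poisson log-linear model by IRLS; return (missed, total, aic)."""
    rows = [design_row(h, pairs) for h in CELLS]
    y = [float(counts[h]) for h in CELLS]
    p = len(rows[0])
    mu = [v + 0.5 for v in y]  # common IRLS starting values
    beta = [0.0] * p
    for _ in range(iterations):
        z = [math.log(m) + (v - m) / m for v, m in zip(y, mu)]  # working response
        xtwx = [[sum(m * r[j] * r[k] for m, r in zip(mu, rows)) for k in range(p)]
                for j in range(p)]
        xtwz = [sum(m * r[j] * zi for m, r, zi in zip(mu, rows, z)) for j in range(p)]
        beta = solve(xtwx, xtwz)
        mu = [math.exp(sum(b * v for b, v in zip(beta, r))) for r in rows]
    loglik = sum(v * math.log(m) - m - math.lgamma(v + 1.0) for v, m in zip(y, mu))
    missed = math.exp(beta[0])  # fitted count for history (0, 0, 0)
    return missed, sum(y) + missed, 2 * p - 2 * loglik

# Hypothetical toy counts generated from three independent sources (not study data).
toy = {(1, 1, 1): 40, (1, 1, 0): 160, (1, 0, 1): 60, (1, 0, 0): 240,
       (0, 1, 1): 40, (0, 1, 0): 160, (0, 0, 1): 60}
candidates = {"independence": (), "all two-way": tuple(combinations(range(3), 2))}
results = {name: fit_loglinear(toy, pairs) for name, pairs in candidates.items()}
best = min(results, key=lambda name: results[name][2])
print(best, results[best])
```

The model with all three two-way interactions has as many parameters as there are observed cells, so it reproduces the counts exactly; AIC penalizes those extra parameters, which is why the simpler model is preferred when the toy sources really are independent. In real data, dependence between sources can favor the richer model.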

Results
Overall, 838 (91%) of the 925 hospitals in Iran participated in the survey. Forty-five hospitals (4.86%) refused to participate and 42 (4.5%) did not answer. Overall, 27,048 GPs were identified after removing duplicate records. A Venn diagram shows the GPs common to the three data sources (Fig. 1). The MOHME HRM office had the largest number of GPs (16,381), while the hospitals survey and the GPs' licensing office had lower numbers, 6,986 and 8,837 respectively. Of the GPs, 14,609 (54.30%) were male; the mean age was 42.24 (±10.32) years for men and 37.24 (±8.83) years for women. Public sector GPs comprised 69% of all records. Table 1 shows the demographic characteristics; the majority of the reported GPs (40%-60%) were aged 45 to 55.

GPs Supply Estimation Using CRC Method
The best log-linear model was the one with three two-way interactions between the sources (Table 3). A comparison of the current number of GPs with the estimated number is shown in Table 3. The greatest difference (more than 60%) between the current and estimated numbers of GPs was in the provinces of Kohgiluyeh & Boyer Ahmad, Golestan and Qazvin, while the lowest difference (less than 20%) was in the provinces of Alborz, Ilam and Sistan & Baluchestan. Figure 2 elaborates on the ranking of provinces.
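The share of the estimated active GPs that each source captured can be worked out directly from the figures reported in this paper (53,630 estimated; 16,381, 6,986 and 8,837 records per source; 27,048 after deduplication). The short Python sketch below only restates that arithmetic; the variable and function names are illustrative, and the percentages are derived here rather than quoted from the study's tables.

```python
# Figures reported in the text; names are illustrative only.
ESTIMATED_ACTIVE_GPS = 53_630
records = {
    "MOHME HRM office": 16_381,
    "hospitals survey": 6_986,
    "GPs' licensing office": 8_837,
    "all three sources, deduplicated": 27_048,
}

def coverage_percent(observed, estimated):
    """Percentage of the estimated population captured by a source."""
    return round(100.0 * observed / estimated, 1)

coverage = {name: coverage_percent(n, ESTIMATED_ACTIVE_GPS)
            for name, n in records.items()}
for name, pct in coverage.items():
    print(f"{name}: {pct}%")
```

The deduplicated figure of about 50.4% matches the statement that the existing databanks together showed roughly half of the active GPs.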

Discussion
This study aimed to achieve a more accurate estimate of active general practitioners and their distribution across Iran. Overall, 53,630 GPs were estimated to be active in the country, and the existing databanks showed only approximately 50% of them. Many GPs, including those active in the private sector, offices, clinics and other governmental sectors, were not registered because there is no centralized and up-to-date registration system. The medical council bank, the most comprehensive existing database for the clinical health workforce and especially for GPs, showed 80 thousand records (4,5). Current estimates showed a 40 thousand difference in the number of GPs, owing to lack of knowledge about the outflow dynamics of GPs through migration, unemployment, or employment in unrelated sectors or fields during the past years. The focus has been on the inflow of the workforce after graduation, and subsequent changes have not been recorded. Given these limitations, the medical council bank may be inappropriate for showing the distribution of GPs in Iran, and findings of earlier studies that used it should be considered with caution. Furthermore, studies that did not use the medical council bank presented no comprehensive information, because other banks such as the human resource bank do not contain complete data on the whole number of GPs. For example, two studies (13,14) that used the medical council bank and statistics center data showed only the distribution of public GPs. In addition, another study considered only GPs working for the ministry of health (12). The results of this study show an estimated 0.76 GPs per 1,000 population. The corresponding ratios in countries of the South-East Asia Region (SEARO), namely Bangladesh, Bhutan, Indonesia, Nepal and Timor-Leste, were 0.41, 0.12, 0.11, 0.09 and 0.07 respectively, significantly lower than in Iran (15-18). Ratios of health professionals were lower in Southeast Asia, which carries nearly 40% of the disease burden considering the failure of preventive plans for communicable diseases (19,20) [...] (22,23).
Although they had a significantly higher ratio of GPs to population in primary care, these countries face the challenges of an aging population and of shorter hospital stays associated with their level of development and better economic status. Nonetheless, in some developed countries such as Germany, Sweden, England, the Netherlands and Denmark (22,24,25), this ratio was lower (0.51, 0.60, 0.67, 0.73 and 0.76), and it was 0.70 in Turkey (26). In these countries, some primary care services were delivered in family health centers by family physicians (27-29) who were specialists; hence there were fewer GPs. By comparison, the GP ratio in Iran was relatively good for a developing country, but it still differed significantly from that of developed countries. Provincial estimates of GPs revealed an imbalanced distribution across the country. Underserved provinces in the western and south-eastern areas had lower ratios, similar to those in South-East Asia. In these areas of Iran, including Hormozgan province, the burden of disease was higher and there was a greater need for healthcare services (30), whereas in advantaged areas in the central and northern regions this ratio was higher, similar to developed countries and in some cases even higher than in the Nordic countries. This imbalance has therefore resulted in a surplus in advantaged areas with lower need, while underserved areas, despite high demand, have had to cope with a shortage of skilled GPs (31). The imbalance depends on factors such as financial incentives, quality of life and other personal preferences (32). According to the WHO, equity in service delivery is challenged when countries with the lowest need have the most health workers, while those with the greatest burden of disease have the fewest (33).

Conclusion
We observed a significant difference in the number of active GPs between Iran and developing countries. Given the shortage of specialists, it is necessary to estimate the number of GPs who could potentially enter residency education. In addition, the quantity of the health workforce in a country is as important as its distribution among regions, considering their demographic, epidemiologic and pathologic characteristics. It is therefore essential to adopt new approaches for accurately estimating GP requirements at the national level and for distributing GPs to underserved areas, which suffer from a shortage of GPs and a greater demand for health services.

Ethical considerations
Ethical issues (including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, redundancy, etc.) have been completely observed by the authors.